Method and apparatus for estimating vanish points from an image, computer program and storage medium thereof

ABSTRACT

The present invention discloses a method and apparatus for estimating vanishing points from an image, and a computer program and storage medium thereof. One method for detecting the vanishing points from an image according to the present invention comprises: a dividing step for dividing the image into small patches; a first detecting step for detecting each patch's local orientations; a composing step for composing lines of pencils based on the local orientations detected in the first detecting step; and a first computing step for computing at least one vanishing point based on the lines of pencils composed in said composing step. On the basis of the vanishing points found by the present invention, perspective rectification of a document image can be executed accurately and quickly.

FIELD OF THE INVENTION

The present invention relates, in general, to a method for automatic perspective rectification. More particularly, the present invention relates to a method and apparatus for estimating vanishing points from an image (e.g. a document image), and a computer program and storage medium thereof.

BACKGROUND OF THE INVENTION

Document scanners are widely used to capture text and transform it into electronic form for further processing. As camera resolution has risen in recent years, text capture through digital cameras is becoming an alternative choice. Digital cameras are portable and offer face-up, non-contact, near-instantaneous image acquisition, but suffer from image quality problems resulting from the wide range of conditions in which they may operate. One of the most severe problems is that cameras shoot documents from arbitrary perspectives and introduce perspective distortions into the captured images. The presence of perspective is distracting to human readers and makes image analysis operations, such as optical character recognition (OCR), layout analysis and compression, slower and less reliable.

Thus, it is desirable to automatically correct the perspective-distorted image to produce an upright view of the text regions.

Although the geometry of rectification is fairly mature, such as the methods proposed by R. M. Haralick in “Monocular vision using inverse perspective projection geometry: analytic relations, Proceedings of the IEEE Computer Vision and Pattern Recognition Conference 1989; 370-378”, few rectification techniques have been reported in the literature for perspectively distorted document images captured through digital cameras. In “Recognizing text in real scenes, International Journal of Document Analysis and Recognition 4 (4) (2002) 243-257” by P. Clark and M. Mirmehdi, the quadrilaterals formed by the borders between the background and the plane where text lies are utilized to get an upright view of perspectively distorted text. After the extraction of quadrilaterals using a perceptual grouping method, a bilinear interpolation operation is applied to construct the corrected document image. As the algorithm depends heavily on the extraction of quadrilaterals, the existence of a high-contrast document border (HDB) within the captured document image is a must for correct rectification.

Instead of using document borders, which do not always exist in real scenes, M. Pilu has proposed a new rectification approach in “Extraction of illusory linear clues in perspectively skewed documents, Proceedings of the IEEE Computer Vision and Pattern Recognition Conference 2001; 363-368” based on the extraction of illusory clues. To extract the horizontal clues, each character or group of characters is first transformed into a blob, and a pairwise saliency measure is computed for pairs of neighboring blobs, which indicates how likely they are to belong to one text line. After that, a network based on perceptual organization principles is traversed over the text, and horizontal clues are calculated as the salient linear groups of blobs. Though it works well for the extraction of horizontal clues, the method cannot extract enough vertical information.

In “Perspective estimation for document images, Proceedings of the SPIE Conference on Document Recognition and Retrieval IX 2002; 244-254” by C. R. Dance, the distorted document image is rectified using two principal vanishing points, which are estimated based on the parallel lines extracted from the text lines and the vertical paragraph margins (VPM). The main drawback of this approach is that it works only on fully aligned text, as it relies heavily on the existence of VPM features. In addition, the means of extracting the parallel lines is not clarified.

In “Rectifying perspective views of text in 3D scenes using vanishing points, Pattern Recognition 36 (2003) 2673-2686” by P. Clark and M. Mirmehdi, two vanishing points are estimated based on some paragraph formatting (PF) information. More specifically, the horizontal vanishing point is calculated based on a novel extension of the 2D projection profile, and the vertical vanishing point is calculated based on PF information such as the VPM, or the text line spacing variation when paragraphs are not fully aligned. However, well-formatted paragraphs are required to implement such a rectification method.

Nowadays, several products that can rectify perspective-distorted document images have been brought to market, for example, Casio EX-Z55 and Wintone Huishi. However, both of them are based on HDB extraction, and the results are not reliable when there is not enough border information.

SUMMARY OF THE INVENTION

In view of the above situation, the object of the present invention is to automatically correct the perspective-distorted image to produce an upright view of the text regions.

To achieve the above stated object, according to an aspect of the present invention, there is provided a method for detecting the vanishing points from an image, comprising: a dividing step for dividing the image into small patches; a first detecting step for detecting each patch's local orientations; a composing step for composing lines of pencils based on the local orientations detected in the first detecting step; and a first computing step for computing at least one vanishing point based on the lines of pencils composed in said composing step.

To achieve the above stated object, according to an aspect of the present invention, there is provided another method for detecting the vanishing points from an image, comprising: a second detecting step for detecting an edge of the image and forming an edge image; an extracting step for extracting the text baseline from said edge image and forming a text baseline image; and a finding step for finding the horizontal vanishing point from said text baseline image.

According to one preferred embodiment, a vertical vanishing point is further located on the basis of the horizontal vanishing point obtained by the above stated method.

Furthermore, a method for perspective rectification in a document image is provided on the basis of the above obtained vanishing points.

To achieve the above stated object, according to another aspect of the present invention, there is provided an apparatus for detecting the vanishing points from an image, comprising: a dividing means for dividing the image into small patches; a first detecting means for detecting each patch's local orientations; a composing means for composing lines of pencils based on the local orientations detected by the first detecting means; and a first computing means for computing at least one vanishing point based on the lines of pencils composed by said composing means.

To achieve the above stated object, according to another aspect of the present invention, there is provided another apparatus for detecting the vanishing points from an image, comprising: a second detecting means for detecting an edge of the image and forming an edge image; an extracting means for extracting the text baseline from said edge image and forming a text baseline image; and a finding means for finding the horizontal vanishing point from said text baseline image.

According to one preferred embodiment, a vertical vanishing point is further located on the basis of the horizontal vanishing point obtained by the above stated apparatus.

Furthermore, an apparatus for perspective rectification in a document image is provided on the basis of the above obtained vanishing points.

A computer program for implementing the above said method of extracting text from document image with complex background is also provided.

In addition, computer program products in at least one computer-readable medium comprising the program codes for implementing the above said method of extracting text from document image with complex background are also provided.

It can be seen that, unlike the above mentioned methods, which depend heavily upon document borders (DB) or paragraph format (PF), the present invention detects the vanishing point from the orientation information of the local spectra of textural areas and from the edge information of characters. The rectification matrix can then be derived from the detected vanishing points. Neither document borders nor paragraph format information is needed. The algorithm according to the present invention can also handle document images containing figures and graphics, such as mathematical equations.

Other objects, features and advantages of the present invention will be apparent from the following description when taken in conjunction with the accompanying drawings, in which like reference characters designate the same or similar parts throughout the drawings thereof.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention. In the drawings:

FIG. 1 shows the block diagram of a computer system, which may be used with the present invention;

FIG. 2 shows the flow chart for detecting the vanishing points from a document image according to the first embodiment of the present invention;

FIG. 3 is a flow chart showing the method for analyzing the local orientation of each patch in the original image;

FIG. 4 a shows an exemplary texture patch taken from the original image;

FIG. 4 b shows the result after spectra filtering on the texture patch shown in FIG. 4 a;

FIG. 4 c shows the result of computing the filtered patch's spectra shown in FIG. 4 b by FFT;

FIG. 4 d shows the pruning result on the filtered spectra shown in FIG. 4 c;

FIG. 5 a shows a two-dimension independent component analysis illustration;

FIG. 5 b shows the spectra orientation result by using the two-dimension independent component analysis shown in FIG. 5 a;

FIG. 5 c shows the texture orientation result by using the two-dimension independent component analysis shown in FIG. 5 a;

FIG. 6 is a flow chart illustrating the process of adaptive pencil pruning according to the first embodiment of the present invention;

FIG. 7 a shows the original pencil for non parallel situation;

FIG. 7 b shows the pruned pencil with respect to the original pencil for non parallel situation shown in FIG. 7 a according to the first embodiment of the present invention;

FIG. 7 c shows the original pencil for parallel situation;

FIG. 7 d shows the pruned pencil with respect to the original pencil for parallel situation shown in FIG. 7 c according to the first embodiment of the present invention;

FIG. 8 illustrates an example for vanishing point estimation in a texture document image, wherein FIG. 8 a shows the original local horizontal pencil and FIG. 8 b shows the original local vertical pencil, FIG. 8 c shows the original image and FIG. 8 d shows the image with vanishing points labeled;

FIG. 9 shows the flow chart of the method for detecting the vanishing points from a document image according to the second embodiment of the present invention;

FIG. 10 shows the flow chart of the method for extracting text baselines according to the second embodiment of the present invention;

FIG. 11 shows the angle enlargement effect of horizontal compression;

FIG. 12 a shows the X-compressed edge image obtained by the method for extracting text baselines according to the second embodiment of the present invention;

FIG. 12 b shows the text line image extracted by the method for extracting text baselines according to the second embodiment of the present invention;

FIG. 12 c shows the baseline image extracted by the method for extracting text baselines according to the second embodiment of the present invention;

FIG. 12 d shows sub images and their skews according to the second embodiment of the present invention;

FIG. 13 shows the cross angle between the line drawn from a sub image of FIG. 12 d and the line that passes through the rough location of the horizontal vanishing point and the center of the sub image according to the second embodiment of the present invention;

FIG. 14 shows the relationship between search space C and R² according to the second embodiment of the present invention;

FIG. 15 shows the scan process in the search space according to the second embodiment of the present invention;

FIG. 16 shows an original perspective distorted image;

FIG. 17 illustrates the result of HVP according to the second embodiment of the present invention;

FIG. 18 shows the relationships between the parameters for locating the vertical vanishing points according to the second embodiment of the present invention;

FIG. 19 to FIG. 25 give the results of perspective rectification by applying the proposed method according to the second embodiment, wherein FIG. 19 shows a perspective distorted image, FIG. 20 illustrates the detected horizontal vanishing point HVP, with all the horizontal lines derived from the same point, i.e. the HVP, FIG. 21 shows an image block cropped from the edge image before removing edges that do not belong to the vertical strokes, FIG. 22 shows an image block cropped from the edge image after removing the edges that do not belong to the vertical strokes, FIG. 23 shows the detected line segments (vertical strokes), FIG. 24 illustrates the detected horizontal vanishing point HVP and vertical vanishing point VVP, with all the horizontal lines derived from the HVP and all the vertical lines derived from the VVP, and FIG. 25 shows the perspective rectified image according to the second embodiment of the present invention; and

FIG. 26 shows a document entry system based on a digital camera, to which the method for detecting the vanishing points from a document image according to the first and second embodiments of the present invention can be applied.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the invention. However, it will be appreciated by one of ordinary skill in the art that the present invention shall not be limited to these specific details.

Firstly, an example of the computer system that can implement the present invention will be described with reference to FIG. 1.

The method of the invention may be implemented in any image processing device, for example, a personal computer (PC), a notebook, or a single-chip microcomputer (SCM) embedded in a camera, a video camera, a scanner, etc. To a person skilled in the art, it would be easy to realize the method of the invention through software, hardware and/or firmware. It should be particularly noted that, to implement any step of the method or any combination of the steps, or any combination of the components, it is obvious to a person skilled in the art that it may be necessary to use an I/O device, a memory device, a microprocessor such as a CPU, and the like. The following descriptions of the method of the present invention will not necessarily mention such devices, although they are actually used.

As the image processing device mentioned above, the block diagram illustrated in FIG. 1 shows one example of a typical computer system, which may be used with the present invention. Note that while FIG. 1 illustrates various components of a computer system, it is not intended to represent any particular architecture or manner of interconnecting the components, as such details are not germane to the present invention. It will also be appreciated that network computers and other data processing systems, which have fewer components or perhaps more components, may also be used with the present invention.

As shown in FIG. 1, the computer system, which is a form of a data processing system, includes a bus 101 that is coupled to a microprocessor 102, a ROM 104, a volatile RAM 105 and a non-volatile memory 106. The microprocessor 102, which may be a Pentium microprocessor from Intel Corporation, is coupled to cache memory 103 as shown in the example of FIG. 1. The bus 101 interconnects these various components, and also connects these components 103, 104, 105, and 106 to a display controller and display device 107 and to peripheral devices such as input/output (I/O) devices, which may be mice, keyboards, modems, network interfaces, printers, and other devices that are well known in the art. Typically, the input/output devices 109 are coupled to the system through input/output controllers 108. The volatile RAM 105 is typically implemented as dynamic RAM (DRAM), which requires power continuously in order to refresh or maintain the data in the memory. The non-volatile memory 106 is typically a magnetic hard drive or a magnetic optical drive or an optical drive or a DVD RAM or other type of memory system, which maintains data even after the power is removed from the system. Typically, the non-volatile memory will also be a random access memory, although this is not required. While FIG. 1 shows that the non-volatile memory is a local device coupled directly to the rest of the components in the data processing system, it will be appreciated that the present invention may utilize a non-volatile memory which is remote from the system, such as a network storage device which is coupled to the data processing system through a network interface such as a modem or Ethernet interface. The bus 101 may include one or more buses connected to each other through various bridges, controllers, and/or adapters, as is well known in the art. In one embodiment, the I/O controller 108 includes a USB (Universal Serial Bus) adapter for controlling USB peripherals.

Next, the embodiments of methods for estimating vanishing points from a document image according to the present invention will be explained in detail by referring to the accompanying drawings.

Before illustrating the concrete embodiments of the present invention, the technical terms used in the present invention will be briefly summarized in the following table.

ICA: Independent Component Analysis.

FFT: Fast Fourier Transform.

DC: Direct Current.

Pencil: The set of all lines through a point. For detailed information, please refer to the following web page: http://mathworld.wolfram.com/Pencil.html

Pencil prune: Delete the worst line in a pencil, one by one according to the lines' quality, until the pencil quality is good enough.

Pencil analysis: Find the vanishing point of a pencil.

Vanishing point: The vanishing point is defined as the convergence point of lines in an image plane that is produced by the projection of parallel lines in real space. For detailed information, please refer to the following web page: http://mathworld.wolfram.com/VanishingPoint.html

HVP: In the embodiments described in the present invention, the horizontal vanishing point (HVP) is the convergence point of horizontal lines (for example, text baselines).

VVP: In the embodiments described in the present invention, the vertical vanishing point (VVP) is the convergence point of vertical lines (for example, vertical strokes, justified paragraph borders).

“OR”: “OR” means that, for the N to 1 mapping from the original image to the compressed image, if there is at least one black pixel among the N pixels, then the pixel on the compressed image is set as black.

Text baseline: The text baseline is a continuous or discontinuous line constituted by the base line (for example, European languages) or bottom line (for example, East Asian languages) of each character in the compressed image.

Rectification matrix: According to the geometry of rectification method proposed by R. M. Haralick in “Monocular vision using inverse perspective projection geometry: analytic relations, Proceedings of the IEEE Computer Vision and Pattern Recognition Conference 1989; 370-378”, a 3×3 rectification matrix can be derived from the HVP and VVP. It is the inverse of the distortion matrix. By using the rectification matrix, an upright view can be easily recovered from the distorted image.

Now the present invention will be described in connection with the accompanying drawings by adopting the above defined technical terms. Please note that the technical terms used in the following description and the claims are generally to be understood with the meanings defined in the above table, unless otherwise specifically explained.

First Embodiment

FIG. 2 shows the flow chart of the method for detecting the vanishing points from a document image according to the first embodiment of the present invention.

As shown in FIG. 2, firstly, at step 100, some sample points are set in the original image, for example, 8×8 evenly distributed points. Each sample point is associated with a small patch (small area) of a template size such as 64, 128 or 256, which should be appropriate for the FFT (Fast Fourier Transform). After the process in step 100, the original image is divided into small patches (i.e. small areas are extracted from the original image), and each patch is centered on one of the sample points.
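Step 100 could be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function names `sample_points` and `patch_bounds`, the choice of grid cell centres as sample points, and the way a patch is shifted to stay inside the image near a border are all assumptions, since the text does not specify them.

```python
def sample_points(width, height, nx=8, ny=8):
    """Evenly distributed sample points (centres of an nx-by-ny grid of cells)."""
    return [(int((i + 0.5) * width / nx), int((j + 0.5) * height / ny))
            for j in range(ny) for i in range(nx)]


def patch_bounds(center, size, width, height):
    """Corners (x0, y0, x1, y1) of a size-by-size patch around `center`.

    Assumption: near the border the patch is shifted so that it stays inside
    the image, so it is then only approximately centred on the sample point.
    Requires size <= width and size <= height.
    """
    cx, cy = center
    x0 = min(max(cx - size // 2, 0), width - size)
    y0 = min(max(cy - size // 2, 0), height - size)
    return x0, y0, x0 + size, y0 + size


# Example: a 1024x768 image, 8x8 sample points, template size 128.
points = sample_points(1024, 768)
patches = [patch_bounds(p, 128, 1024, 768) for p in points]
```

A power-of-two template size such as 128 keeps each patch suitable for a radix-2 FFT in the next step.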

Then, in step 200, each patch's local orientations are analyzed. It is hard to analyze a texture patch's orientations directly; generally they are estimated from its spectra. The present invention is also based on this idea, but analyzes the spectra by ICA (Independent Component Analysis). Step 200 consists of four sub-procedures, as illustrated in FIG. 3.

FIG. 3 is a flow chart showing the method for analyzing the local orientation of each patch in the original image. It should be noted that the following description is directed to the exemplary texture patch taken from the original image shown in FIG. 4 a.

As shown in FIG. 3, in step 210 a spectra filter (for example, a Hanning filter) is used to preprocess the image patch shown in FIG. 4 a by convolution, so as to obtain a smooth spectral response. The result of this preprocessing is illustrated in FIG. 4 b.

Then, in step 220, the spectra of the filtered patch shown in FIG. 4 b are computed by FFT and shifted to a symmetrical layout. The result is illustrated in FIG. 4 c.

Since it is cumbersome to analyze the original spectra directly, the present invention prunes the spectra for optimization while keeping the structural information of the original spectra. In step 230, the spectra are pruned by retaining only the n largest spectral components (e.g., n = template size). It should be noted that the DC (Direct Current) component is also deleted. The result is illustrated in FIG. 4 d.
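The pruning of step 230 can be sketched as below. The function name `prune_spectrum` and the input layout are assumptions made for illustration: the input is taken to be a magnitude spectrum that has already been shifted so that the DC term sits at index (h // 2, w // 2), which is where a conventional FFT shift places it.

```python
def prune_spectrum(mag, n):
    """Keep the n largest non-DC entries of a centred magnitude spectrum.

    mag: list of rows of non-negative floats, already shifted so that the
    DC component lies at (h // 2, w // 2). All other entries are set to zero.
    """
    h, w = len(mag), len(mag[0])
    dc = (h // 2, w // 2)
    # Collect every component except DC, largest first.
    entries = sorted(
        ((mag[y][x], y, x) for y in range(h) for x in range(w) if (y, x) != dc),
        reverse=True,
    )
    pruned = [[0.0] * w for _ in range(h)]
    for value, y, x in entries[:n]:
        pruned[y][x] = value
    return pruned


# Toy 4x4 spectrum with a dominant DC term at (2, 2).
toy = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 100, 12], [13, 14, 15, 16]]
kept = prune_spectrum(toy, 3)
```

In the toy example the DC value 100 is discarded and only the three largest remaining components (16, 15 and 14) survive.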

Thereafter, the pruned spectra are analyzed by the ICA algorithm of the present invention in step 240.

Generally speaking, the ICA algorithm according to the present invention consists of three steps: 1) centering, 2) whitening, 3) maximizing an objective function.

For a spectra image X, each point on the spectra image has two coordinates x and y, and the spectra value at each point is defined as that sample point's probability p. C_x is the covariance of X. Here, the central point is defined as the origin for centering.

Whitening is to search for a transformation V such that Y=VX is white, that is, such that the covariance of Y is an identity matrix. Here

$\begin{matrix}{V = {\Lambda^{- \frac{1}{2}}{\Theta^{T}.}}} & (1)\end{matrix}$

where Λ is the diagonal matrix of the eigenvalues of C_x, and Θ is the matrix whose columns are the eigenvectors of C_x.

For the two-dimension situation, ICA reduces to finding a rotation matrix R, i.e. only one variable θ, such that the PDF (Probability Distribution Function) of the output S=RY is as different as possible from the Gaussian function. Here:

$\begin{matrix}{{R = \begin{bmatrix}{\cos \; \theta} & {{- \sin}\; \theta} \\{\sin \; \theta} & {\cos \; \theta}\end{bmatrix}},} & (2)\end{matrix}$

where θ ∈ [−45°, 45°].

A value of θ is searched for such that the non-Gaussianity is maximum. The most commonly used non-Gaussianity criterion is kurtosis, which is defined as:

κ(X)=E[X⁴]−3(E[X²])².  (3)

Kurtosis is zero for a Gaussian random variable; thus the θ corresponding to the maximum absolute kurtosis value is searched for. An example is illustrated in FIG. 5 a.

The observed signals are X=AS, thus:

A=(RV)⁻¹,  (4)

where the column vectors of A represent two independent orientations, as illustrated in FIG. 5 b. It should be noted that the orientations of the spectra are orthogonal to the orientations of the original patch. The resulting texture orientations are illustrated in FIG. 5 c.
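The two-dimensional ICA described above (centering, whitening, rotation search, then A=(RV)⁻¹) might be sketched as follows. Several points are assumptions rather than statements of the disclosed method: the function name `ica_2d`, the grid search with a fixed angular step, the use of the sum of the absolute kurtoses of both output components as the objective, and the whitening form V=Λ^(−1/2)Θ^T (the standard form that makes the covariance of Y the identity). Spectral values are used as weights, following the definition of p above.

```python
import math


def ica_2d(points, step_deg=0.5):
    """points: (x, y, p) with x, y relative to the spectrum centre, p >= 0.

    Returns the 2x2 matrix A = (R V)^-1; its columns are the two orientations.
    """
    total = sum(p for _, _, p in points)
    # Weighted covariance about the origin (the centre is the origin).
    cxx = sum(p * x * x for x, y, p in points) / total
    cyy = sum(p * y * y for x, y, p in points) / total
    cxy = sum(p * x * y for x, y, p in points) / total
    # Closed-form eigen-decomposition of the symmetric 2x2 covariance.
    phi = 0.5 * math.atan2(2 * cxy, cxx - cyy)
    spread = math.hypot((cxx - cyy) / 2, cxy)
    lam1, lam2 = (cxx + cyy) / 2 + spread, (cxx + cyy) / 2 - spread
    c, s = math.cos(phi), math.sin(phi)
    # Whitening matrix V = Lambda^(-1/2) Theta^T (assumes lam2 > 0).
    V = [[c / math.sqrt(lam1), s / math.sqrt(lam1)],
         [-s / math.sqrt(lam2), c / math.sqrt(lam2)]]
    white = [(V[0][0] * x + V[0][1] * y, V[1][0] * x + V[1][1] * y, p / total)
             for x, y, p in points]

    def kurtosis(values):
        m2 = sum(w * v ** 2 for v, w in values)
        m4 = sum(w * v ** 4 for v, w in values)
        return m4 - 3 * m2 ** 2  # Equation (3)

    best_score, best_theta = -1.0, 0.0
    for k in range(int(round(90 / step_deg)) + 1):
        theta = math.radians(-45 + k * step_deg)
        ct, st = math.cos(theta), math.sin(theta)
        s1 = [(ct * u - st * v, w) for u, v, w in white]
        s2 = [(st * u + ct * v, w) for u, v, w in white]
        score = abs(kurtosis(s1)) + abs(kurtosis(s2))  # assumed objective
        if score > best_score:
            best_score, best_theta = score, theta
    ct, st = math.cos(best_theta), math.sin(best_theta)
    # M = R V, then A = M^-1 (Equation (4)).
    M = [[ct * V[0][0] - st * V[1][0], ct * V[0][1] - st * V[1][1]],
         [st * V[0][0] + ct * V[1][0], st * V[0][1] + ct * V[1][1]]]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det], [-M[1][0] / det, M[0][0] / det]]


# Synthetic "spectrum": energy along 0 and 60 degrees, decaying as 1/t^2.
pts = []
for t in range(1, 11):
    for sign in (1, -1):
        pts.append((sign * t, 0.0, 1.0 / t ** 2))
        pts.append((sign * t * math.cos(math.radians(60)),
                    sign * t * math.sin(math.radians(60)), 1.0 / t ** 2))
A = ica_2d(pts)
angles = [math.degrees(math.atan2(A[1][j], A[0][j])) for j in range(2)]
```

On this synthetic input the columns of A point, up to sign and order, along the two directions in which the spectral energy was placed; rotating them by 90° would give the corresponding texture orientations.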

In addition, it should be noted that the above described step 200 also implies a partial ICA algorithm, as explained below.

Equation (4) computes two local independent orientations simultaneously. In some situations, one orientation is known in advance; Equation (4) can then be used to compute the other orientation by searching for the θ that minimizes the difference between the orientation known in advance and one of the computed orientations. The other orientation is thereby obtained. This method is very fast, but it applies only if one orientation is known in advance.

Now, returning to FIG. 2, after the local orientations of each patch have been analyzed with the above stated method, the process proceeds to step 300, in which the pencils are constructed from the sample points and the local orientations of the patches. Here, a pencil means the set of all lines through a vanishing point.

Since each patch has two orientations, they can be easily classified by their slopes. Accordingly, two lines can be drawn for each patch from its local orientations and its associated sample point. These lines are represented as r=x cos θ+y sin θ. All these lines can be simply categorized as being “vertical” or “horizontal”. Each group of lines is supposed to intersect at one vanishing point, and thus forms a pencil.
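The construction of pencil lines in step 300 can be illustrated with a short sketch. The names `line_through` and `split_orientations` are hypothetical, and the rule used here to label the two orientations of a patch (the one closer to the x-axis is "horizontal") is an assumption; the text only states that they are classified by their slopes.

```python
import math


def line_through(point, direction_deg):
    """(r, theta) of the line through `point` running along `direction_deg`.

    The line is r = x cos(theta) + y sin(theta), where theta is the angle of
    the line's normal, i.e. the direction angle plus 90 degrees.
    """
    px, py = point
    theta = math.radians(direction_deg + 90.0)
    return px * math.cos(theta) + py * math.sin(theta), theta


def split_orientations(dir_a_deg, dir_b_deg):
    """Return (horizontal, vertical) for the two orientations of one patch."""
    def tilt(d):
        d %= 180.0
        return min(d, 180.0 - d)  # angle to the x-axis, in [0, 90]
    if tilt(dir_a_deg) <= tilt(dir_b_deg):
        return dir_a_deg, dir_b_deg
    return dir_b_deg, dir_a_deg


# A patch centred at (3, 4) with orientations of 85 and 0 degrees.
horizontal, vertical = split_orientations(85.0, 0.0)
r_h, theta_h = line_through((3, 4), horizontal)
```

For the horizontal orientation the sketch yields the line y = 4, i.e. r = 4 with a normal angle of 90°.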

Thereafter, the process proceeds to step 400 to perform the adaptive pencil line pruning process (adaptive pencil line deletion process).

Since pencil lines are estimated in real situations, some noisy lines may be included in a pencil. These noisy lines should be pruned (deleted) for better results. Step 400 prunes these noisy lines adaptively, and the details are explained as follows.

If three lines (r_i,θ_i), (r_j,θ_j), (r_k,θ_k) are parallel or intersect at one point, the following relationship holds:

r_i sin(θ_j−θ_k) + r_j sin(θ_k−θ_i) + r_k sin(θ_i−θ_j) = 0.  (5)

This formula is simple and follows directly from the definition of the lines. The method of adaptively pruning noisy lines according to the present invention is therefore based on this formula.
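One way to see why Equation (5) holds is sketched here as a supporting derivation; it is not taken from the original text. Each line satisfies x cos θ + y sin θ − r = 0, so three lines share a common point (a finite point, or a point at infinity when they are parallel) exactly when the determinant of their coefficients vanishes:

$\begin{vmatrix}\cos\theta_{i} & \sin\theta_{i} & r_{i} \\ \cos\theta_{j} & \sin\theta_{j} & r_{j} \\ \cos\theta_{k} & \sin\theta_{k} & r_{k}\end{vmatrix} = -\left\lbrack r_{i}\sin\left(\theta_{j} - \theta_{k}\right) + r_{j}\sin\left(\theta_{k} - \theta_{i}\right) + r_{k}\sin\left(\theta_{i} - \theta_{j}\right)\right\rbrack = 0.$

Expanding the determinant along its third column gives the bracketed expression, which is the left-hand side of Equation (5).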

For each line (r_i,θ_i) in a pencil, a line quality is defined by the following equation:

$\begin{matrix}{{LineQ}_{i} = {\sum\limits_{j,k}\left| {{r_{i}{\sin \left( {\theta_{j} - \theta_{k}} \right)}} + {r_{j}{\sin \left( {\theta_{k} - \theta_{i}} \right)}} + {r_{k}{\sin \left( {\theta_{i} - \theta_{j}} \right)}}} \right|}.} & (6)\end{matrix}$

A smaller LineQ_i means better line quality.

In addition, a pencil's quality is defined as:

$\begin{matrix}{{{PencilQ} = \frac{\sum\limits_{i}{LineQ}_{i}}{N\left| r_{0} \right|}},} & (7)\end{matrix}$

where N is the number of lines in the pencil, and r₀ belongs to the line with the minimum LineQ and is used here for normalization. The value of PencilQ is meaningful only if it is normalized as in Equation (7). Equation (7) can be seen as an intrinsic metric for pencils, since it is independent of the coordinate origin, translation, scale and rotation.

PencilQ should be very small for any good pencil. After a quality threshold PencilQ_Th for a good pencil has been set (for example, PencilQ_Th=0.5), the pencil's quality is computed. If it is bigger than PencilQ_Th, the worst line, i.e. the one with the maximum LineQ, is deleted (N=N−1) and the pencil's quality is recalculated, until it is less than PencilQ_Th. FIG. 6 illustrates this process. In real situations, a pencil may be in one of two states, parallel or non-parallel. Both can first be pruned by the above procedure. Once the pencil quality is good enough, the parallel situation can be easily identified, for example from the variance of the angles.

The sub-procedures in step 400 are illustrated in FIG. 6. FIG. 6 is a flow chart illustrating the process of adaptive pencil line pruning.

As shown in FIG. 6, first at step 410, each line's quality LineQ is computed.

Then, in step 420, each pencil's quality is computed by combining Equation (6) and Equation (7).

After each pencil's quality has been computed, it is compared with the pre-defined threshold to judge whether the pencil is good enough.

If the pencil's quality is larger than the threshold, the pencil is not good enough and the process proceeds to step 430, in which the pencil is pruned by deleting the worst line according to the lines' quality, and each line's quality is computed again.

The above steps are repeated until the pencil quality is good enough, i.e. until all abnormal lines are removed.
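The loop of steps 410 to 430 can be sketched as below, using Equations (6) and (7) with absolute values of the residual of Equation (5). The function names are hypothetical, and two details are assumptions: the loop stops once only two lines remain (any two lines trivially form a pencil), and r₀ is assumed to be non-zero.

```python
import math


def residual(a, b, c):
    """Left-hand side of Equation (5) for three lines given as (r, theta)."""
    (ri, ti), (rj, tj), (rk, tk) = a, b, c
    return ri * math.sin(tj - tk) + rj * math.sin(tk - ti) + rk * math.sin(ti - tj)


def line_quality(lines):
    """Equation (6): LineQ_i summed over all pairs (j, k) of pencil lines."""
    return [sum(abs(residual(li, lj, lk)) for lj in lines for lk in lines)
            for li in lines]


def pencil_quality(lines, quality):
    """Equation (7), normalised by r0 of the best line (assumed non-zero)."""
    r0 = lines[quality.index(min(quality))][0]
    return sum(quality) / (len(lines) * abs(r0))


def prune_pencil(lines, threshold=0.5):
    """Delete the worst line repeatedly until PencilQ <= threshold."""
    lines = list(lines)
    while len(lines) > 2:
        quality = line_quality(lines)               # step 410
        if pencil_quality(lines, quality) <= threshold:  # step 420 + test
            break
        lines.pop(quality.index(max(quality)))      # step 430
    return lines


# Five lines through (100, 50) plus one noisy line that misses that point.
good = [(100 * math.cos(t) + 50 * math.sin(t), t)
        for t in map(math.radians, (80, 85, 90, 95, 100))]
noisy = (300.0, math.radians(92))
pruned = prune_pencil(good + [noisy])
```

In this example the noisy line has the largest LineQ, so it is the only line removed before the pencil quality falls below the threshold.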

The results of adaptive pencil pruning are illustrated in FIG. 7. FIG. 7 a shows the original pencil for the non-parallel situation, and FIG. 7 b shows the corresponding pruned pencil according to the first embodiment of the present invention. In addition, FIG. 7 c shows the original pencil for the parallel situation, and FIG. 7 d shows the corresponding pruned pencil according to the first embodiment of the present invention.

Now returning to FIG. 2 again, after the pencils have been adaptively pruned, the process proceeds from step 400 to step 500. In step 500, the vanishing point is computed with the formula derived below.

Each pencil corresponds to a vanishing point. However, there is still no reliable method to compute such a vanishing point. The present invention therefore proposes a new method based on a newly derived formula, and the details are explained as follows.

Suppose a pencil (a series of lines (r_i,θ_i), where i ∈ [1, N]) is obtained in a perspective-distorted image, and it is supposed to have a vanishing point (x₀,y₀). For any line (r,θ) passing through (x₀,y₀), the following equation (8) holds:

r=x ₀ cos θ+y ₀ sin θ.  (8)

Any two lines (r_i,θ_i) and (r_j,θ_j) in this pencil, together with the line (r,θ), must satisfy Equation (5).

An objective function E is defined as:

$\begin{matrix}{{E = {\sum\limits_{i,j}\left( {{r\; {\sin \left( {\theta_{i} - \theta_{j}} \right)}} + {r_{i}{\sin \left( {\theta_{j} - \theta} \right)}} + {r_{j}{\sin \left( {\theta - \theta_{i}} \right)}}} \right)^{2}}},} & (9)\end{matrix}$

here, E can be seen as an overall measure of the goodness of fit of thevanishing point to the pencil. E≧0 (=0 only in ideal situation, saying,pencil intersects exactly at one point), and E is minimum only if (r,θ)passes this pencil's supposed vanishing point. Based on this analysis,the following equation (10) can be derived:

$\begin{matrix}{{\frac{\partial E}{\partial r} = 0},} & (10)\end{matrix}$

if (r, θ) passes through the vanishing point.

Combining Equation (9) and Equation (10), the following equation (11) can be obtained:

$\begin{matrix}{r = {\frac{\sum\limits_{i,j}{{\sin \left( {\theta_{i} - \theta_{j}} \right)}\left\lbrack {{r_{i}{\sin \left( {\theta_{j} - \theta} \right)}} + {r_{j}{\sin \left( {\theta - \theta_{i}} \right)}}} \right\rbrack}}{- {\sum\limits_{i,j}{\sin^{2}\left( {\theta_{i} - \theta_{j}} \right)}}}.}} & (11)\end{matrix}$

By comparing Equation (11) with Equation (8) term by term in cos θ and sin θ, the vanishing point (x₀, y₀) can be estimated as:

$\begin{matrix}{\begin{bmatrix}x_{0} \\y_{0}\end{bmatrix} = {{\frac{- 1}{\sum\limits_{i,j}{\sin^{2}\left( {\theta_{i} - \theta_{j}} \right)}}\begin{bmatrix}{\sum\limits_{i,j}{{\sin \left( {\theta_{i} - \theta_{j}} \right)}\left( {{r_{i}\sin \; \theta_{j}} - {r_{j}\sin \; \theta_{i}}} \right)}} \\{\sum\limits_{i,j}{{\sin \left( {\theta_{i} - \theta_{j}} \right)}\left( {{r_{j}\cos \; \theta_{i}} - {r_{i}\cos \; \theta_{j}}} \right)}}\end{bmatrix}}.}} & (12)\end{matrix}$

Equation (12) is the vanishing point estimation method proposed according to the present invention. FIG. 8 illustrates an example of vanishing point estimation in a texture document image, wherein FIG. 8a shows the original local horizontal pencil, FIG. 8b shows the original local vertical pencil, FIG. 8c shows the original image, and FIG. 8d shows the image with the vanishing points labeled.
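As an illustration only, Equation (12) can be evaluated directly over all line pairs of a pencil. The following is a minimal sketch, assuming NumPy is available; the function name `estimate_vanishing_point` and the tolerance `eps` are hypothetical and do not appear in the description.

```python
import numpy as np

def estimate_vanishing_point(r, theta, eps=1e-12):
    """Evaluate Equation (12) for a pencil of lines r_i = x cos(theta_i) + y sin(theta_i)."""
    r = np.asarray(r, dtype=float)
    theta = np.asarray(theta, dtype=float)
    ri, rj = r[:, None], r[None, :]          # r_i and r_j for every pair (i, j)
    ti, tj = theta[:, None], theta[None, :]  # theta_i and theta_j for every pair
    s = np.sin(ti - tj)                      # sin(theta_i - theta_j)
    denom = np.sum(s ** 2)
    if denom < eps:
        # Hypothetical guard: all lines (nearly) parallel, vanishing point at infinity.
        raise ValueError("pencil is (nearly) parallel")
    x0 = -np.sum(s * (ri * np.sin(tj) - rj * np.sin(ti))) / denom
    y0 = -np.sum(s * (rj * np.cos(ti) - ri * np.cos(tj))) / denom
    return x0, y0
```

For a noise-free pencil whose lines all pass through one point, the sketch should return that point up to rounding error; for a pruned but noisy pencil it returns the estimate defined by Equation (12).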

Second Embodiment

As another embodiment of the present invention, vanishing points are located by analyzing the edge information of characters, which differs from the first embodiment described above. The main steps of the method according to the second embodiment of the present invention are shown in FIG. 9.

FIG. 9 shows the flow chart of the method for detecting the vanishing points from a document image according to the second embodiment of the present invention.

As shown in FIG. 9, first, in step 9100, edges are detected and non-text edges are removed.

To facilitate edge detection, color images and black-and-white (BW) images are first converted into their gray-scale representation. Edges are then detected with a Sobel edge detector followed by non-maximum suppression.

The sensitivity threshold (ST) for the Sobel edge detector is computed automatically from the histogram of gradient magnitudes by using Otsu's threshold method. The edge detector ignores all edges that are not stronger than the sensitivity threshold ST.
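The thresholding described above can be sketched as follows. This is a minimal illustration under stated assumptions: it uses NumPy only, the function names `sobel_magnitude` and `otsu_threshold` and the bin count of 256 are hypothetical, borders are handled by edge replication, and the non-maximum suppression stage is omitted.

```python
import numpy as np

def sobel_magnitude(gray):
    """Sobel gradient magnitude of a 2-D gray-scale image (borders replicated)."""
    p = np.pad(np.asarray(gray, dtype=float), 1, mode="edge")
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return np.hypot(gx, gy)

def otsu_threshold(values, bins=256):
    """Otsu threshold on the histogram of `values` (here: gradient magnitudes)."""
    hist, edges = np.histogram(values, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(hist).astype(float)       # pixel count at or below each bin
    w1 = w0[-1] - w0                         # pixel count above each bin
    m0 = np.cumsum(hist * centers)
    mu0 = m0 / np.maximum(w0, 1.0)           # mean of the lower class
    mu1 = (m0[-1] - m0) / np.maximum(w1, 1.0)  # mean of the upper class
    between = w0 * w1 * (mu0 - mu1) ** 2     # between-class variance (unnormalized)
    return centers[np.argmax(between)]
```

An edge mask in the sense of the sensitivity threshold ST is then `sobel_magnitude(gray) > otsu_threshold(sobel_magnitude(gray))`.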

After edge detection, connective component (CC) analysis is performed on the edge image to remove non-text edges. The analysis is based mainly on the size and aspect ratio of each connective component. If a connective component is too large or too small, or has a large aspect ratio (in which case it is possibly a line), it is classified as a non-text connective component, and all edges belonging to it are removed from the edge image.

Then, in step 9200, the text baselines are extracted. Most of the edges in the edge image now belong to characters. The horizontal vanishing point (HVP) can be estimated based on the parallel lines extracted from text alignment information, such as text baselines. In the present invention, text baselines are extracted by using the method shown in FIG. 10.

While extracting the text baselines, a major direction of the baseline image is needed for later processing and HVP finding. This major direction is the rough text line orientation detected by skew detection on the original gray image. A nearest-neighbor-based method, such as the method disclosed by C. R. Dance in “Perspective estimation for document images”, Proceedings of the SPIE Conference on Document Recognition and Retrieval IX, 2002, pp. 244-254, is used to detect the rough skew angle of the document image. This angle is regarded as the text line orientation. An alternative method is to generate several baseline images in different pre-given orientations, e.g. 0, −30, 30 and 90 degrees, and to choose the one with the best continuity and linearity.

After the skew detection, the edge image is rotated in step 1001 by the determined rough skew angle or by a certain pre-given orientation.

Then, in step 1002, the rotated image is compressed along the X direction by the “OR” method. The compression ratio should vary according to the character size or image size. This kind of anisotropic “OR” compression brings two benefits. Firstly, close characters and close words are connected into text lines, as in the X-compressed edge image shown in FIG. 12a. Secondly, the distortion or skewness is enlarged, which makes the distortion much easier to detect, as shown in FIG. 11, which illustrates the angle enlargement effect of horizontal compression.
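One possible reading of the “OR” compression of step 1002 is that each group of N horizontally adjacent pixels is replaced by their logical OR. The sketch below follows that reading; the function name `or_compress_x` and the zero padding of the last incomplete group are assumptions, not details taken from the description.

```python
import numpy as np

def or_compress_x(binary, ratio):
    """Compress a binary image along X: each output pixel is the OR of `ratio` input pixels."""
    img = np.asarray(binary, dtype=bool)
    h, w = img.shape
    pad = (-w) % ratio  # pad the width up to a multiple of `ratio` with background
    img = np.pad(img, ((0, 0), (0, pad)), mode="constant", constant_values=False)
    return img.reshape(h, -1, ratio).any(axis=2)
```

Because only the X axis is shortened, a line with slope dy/dx in the input has slope of about N·dy/dx in the output, which is the angle enlargement effect mentioned above.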

In order to get more continuous baselines, the well-known RLSA (Run Length Smoothing Algorithm) operation is performed along the X direction on the compressed image in step 1003 to connect words into lines. The threshold for the minimum run length is set to 4. Then, in step 1004, spaces (small “holes”) in and between characters (words) are filled by finding and analyzing white connected components. The result is shown in FIG. 12b, which shows the text line image. Thereafter, in step 1005, the end points of black runs along the Y direction are extracted as baseline points, as shown in FIG. 12c, which shows the baseline image.
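The horizontal RLSA of step 1003 can be sketched as follows. The sketch assumes the usual RLSA convention that a white run bounded by black pixels on both sides is filled when its length does not exceed the threshold; whether the comparison is inclusive is not stated in the description and is an assumption here, as is the function name `rlsa_horizontal`.

```python
import numpy as np

def rlsa_horizontal(binary, threshold=4):
    """Fill short white gaps between black pixels along each row (True = black)."""
    img = np.asarray(binary, dtype=bool)
    out = img.copy()
    for y in range(img.shape[0]):
        xs = np.flatnonzero(img[y])           # columns of black pixels in this row
        for a, b in zip(xs[:-1], xs[1:]):
            gap = b - a - 1                   # white pixels between two black pixels
            if 0 < gap <= threshold:
                out[y, a + 1:b] = True
    return out
```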

After the text baselines have been extracted according to the method shown in FIG. 10, the process returns to FIG. 9 and proceeds to step 9300.

In step 9300, the horizontal vanishing point is located. This is done by first locating the rough position of the HVP in step 9300-1 and then finding the accurate position of the HVP in the neighborhood of the rough HVP in step 9300-2, as described in detail below.

Firstly in step 9300-1, the rough HVP is located.

After the baseline image has been obtained, it is divided into M=2*N sub images.

Here, M is an empirical value, and N is the compression ratio defined in step 9300. For each sub image I_i, an average skew θ_i is determined, and the maximum premium W_i, which is defined as the square sum of the projection profile, is also determined by a projection-profile-based method. Here, it is assumed that only skew distortion exists in the sub images; the perspective distortion in these sub images is so small that it can be ignored. FIG. 12d shows the sub images and their skews.

Then the rough location of the HVP is calculated from the skews of the sub images. A line can be drawn from the center ($\bar{x}_{i}$, $\bar{y}_{i}$) of each sub image, with the angle θ_i and the weight W_i. Given the set of lines L≡{L_i, i=0, . . . , M−1}, the lines are grouped two at a time, the intersection point (x₀, y₀) of each pair is obtained, and the following functional is calculated:

${f\left( {x_{0},y_{0}} \right)} = {\sum\limits_{i = 0}^{L - 1}\left( W_{i} \middle| {\beta_{i} < {\Delta\beta}} \right)}$

where β_i (illustrated in FIG. 13) is the cross angle between the line L_i and the line that passes through (x₀, y₀) and ($\bar{x}_{i}$, $\bar{y}_{i}$), and Δβ defines a small tolerance on the cross angle. f(x₀, y₀) indicates how many weighted lines pass through (x₀, y₀). The point (x₀, y₀) with the maximum f(x₀, y₀) is chosen as the rough location of the horizontal vanishing point.
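The voting scheme of step 9300-1 can be sketched as follows, assuming NumPy. The function name `rough_hvp`, the tolerance of 1 degree for Δβ, and the determinant test used to skip parallel pairs are hypothetical choices made for illustration.

```python
import itertools
import numpy as np

def rough_hvp(centers, angles, weights, tol=np.deg2rad(1.0)):
    """Return the pairwise intersection with the largest weighted vote f(x0, y0)."""
    centers = np.asarray(centers, dtype=float)
    angles = np.asarray(angles, dtype=float)
    weights = np.asarray(weights, dtype=float)
    dirs = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    best_point, best_score = None, -np.inf
    for i, j in itertools.combinations(range(len(angles)), 2):
        # Solve centers[i] + s * dirs[i] == centers[j] + t * dirs[j] for (s, t).
        A = np.column_stack([dirs[i], -dirs[j]])
        if abs(np.linalg.det(A)) < 1e-9:
            continue                          # parallel pair, no finite intersection
        s, _ = np.linalg.solve(A, centers[j] - centers[i])
        p = centers[i] + s * dirs[i]
        to_p = p - centers                    # from each sub-image center to the candidate
        cross = dirs[:, 0] * to_p[:, 1] - dirs[:, 1] * to_p[:, 0]
        dot = np.sum(dirs * to_p, axis=1)
        beta = np.arctan2(np.abs(cross), np.abs(dot))  # cross angle folded to [0, pi/2]
        score = weights[beta < tol].sum()     # sum of W_i with beta_i < delta-beta
        if score > best_score:
            best_point, best_score = p, score
    return best_point, best_score
```

The sketch visits every pair once, so its cost grows with the square of M times M; for the small M used here (M=2*N sub images) this is unlikely to matter.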

Then, in step 9300-2, the accurate HVP is located.

Finally, another approach, based on projection profiles of the baseline image, is used to find the accurate HVP. Combining these two approaches locates the HVP at lower computation cost and with higher accuracy.

A circular search space C, as illustrated in FIG. 14, is used. FIG. 14 shows the relationship between the search space C and R². Each cell c=(r,θ), with 0≤r<1 and 0≤θ<360°, in the space C corresponds to a hypothesized horizontal vanishing point V=(V_r, V_θ) on the image plane R², at distance V_r=R₀[r/(1−r)] from the center of the image and at angle V_θ=θ, where R₀ is the radius of the image. This maps the infinite plane R² exponentially into the finite search space C. A projection profile is generated for each hypothesized HVP in C, except those lying within the image region itself. A projection profile η is a set of bins {η_i, i=0, 1, 2, . . . }.
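The mapping from a cell of the search space C to a point of the image plane can be written out directly. In the sketch below, the function name `cell_to_hvp` is hypothetical, and the choice of measuring the angle from the positive X axis in image coordinates is an assumption.

```python
import math

def cell_to_hvp(r, theta_deg, image_center, image_radius):
    """Map a cell (r, theta) of C to image coordinates using V_r = R0 * r / (1 - r)."""
    if not 0.0 <= r < 1.0:
        raise ValueError("r must satisfy 0 <= r < 1")
    dist = image_radius * r / (1.0 - r)   # grows without bound as r approaches 1
    t = math.radians(theta_deg)
    cx, cy = image_center
    return cx + dist * math.cos(t), cy + dist * math.sin(t)
```

For example, r=0.5 places the hypothesized HVP at a distance of exactly R₀ from the image center, and values of r closer to 1 place it progressively further away.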

In a perspective transformation, all points on the image plane that lie at the same angle as seen from the horizontal vanishing point must be on the same horizontal line of the real scene. Pixels at different angles are therefore projected to different bins in the projection profile.

The projection profiles mapped from the hypothesized HVPs are compared, and the most qualified projection profile is picked out. Its corresponding HVP is the required point.

Here, the search space is smaller than C because the rough HVP has already been found. Its angle range is set to ±4 and its distance range is only about one fifth of the whole distance range.

A simple hierarchical approach is used for the search process. An initial two-dimensional scan of the search space is performed at low resolution, and the winning HVP, which has the maximum square sum of the projection profile, is picked out. Then a full-resolution two-dimensional scan is performed on the region around the winning HVP, and the accurate HVP is finally found.

For the low-resolution scan, the angle step is 0.5 degree and the distance range is divided into 8 equal parts. For the full-resolution scan, the angle step and the distance step are one sixth of those in the low-resolution scan.

In the initial scan step, if the distance of the rough HVP is large enough, the symmetrical angle of the rough HVP may also be considered.

Because the compression ratios in the X direction and the Y direction differ, the two dimensions of the X-baseline image are not isotropic. To keep the scan uniform, the hypothesized HVPs need to be generated in the search space of the original image.

Each scan proceeds as shown in FIG. 15, which shows the scan process in the search space.

As shown in FIG. 15, at first, in step 1501, the winning (rough) HVP is mapped from the X-baseline image to the original image.

Then, in step 1502, all the hypothesized HVPs on the original image are obtained in the search space, and in step 1503 they are mapped from the original image to the X-baseline image.

Next, in step 1504, the projection profile of the X-baseline image is generated from each hypothesized HVP and its square sum is calculated. Finally, in step 1505, the winning HVP, whose projection profile has the maximum square sum, is found and determined to be the accurate HVP.
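Steps 1504 and 1505 can be illustrated by binning the baseline points according to their angle as seen from a hypothesized HVP and scoring the profile by its square sum. The sketch below is a simplification under stated assumptions: the function name `profile_square_sum` and the bin count are hypothetical, the bins span the angular extent of the points for each candidate, and the score is the square sum of the whole profile rather than of validated text-line segments.

```python
import numpy as np

def profile_square_sum(points, hvp, n_bins=64):
    """Angular projection profile of baseline points seen from `hvp`, and its square sum."""
    pts = np.asarray(points, dtype=float)
    ang = np.arctan2(pts[:, 1] - hvp[1], pts[:, 0] - hvp[0])
    # Measure angles relative to their circular mean to avoid the wrap-around at +/- pi.
    ref = np.arctan2(np.mean(np.sin(ang)), np.mean(np.cos(ang)))
    rel = np.angle(np.exp(1j * (ang - ref)))
    hist, _ = np.histogram(rel, bins=n_bins)
    return float(np.sum(hist.astype(float) ** 2)), hist
```

When the hypothesized HVP coincides with the point at which the baselines converge, all points of one baseline fall into the same bin, so the profile is sharply peaked and its square sum is large; for a misplaced candidate the points spread over many bins and the square sum drops.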

In the analysis of a projection profile, the profile is divided, according to its peaks and valleys, into text lines (i.e. projection profile segments). These segments are then checked, and the square sums of all valid segments are accumulated. The result is regarded as the square sum of the whole projection profile.

An original perspective-distorted image is shown in FIG. 16, and FIG. 17 illustrates the resulting HVP. All the horizontal lines are derived from the same point, i.e. the HVP. Evaluations on many kinds of document images show that the method proposed in the present invention is highly accurate, and that it takes less than several hundred milliseconds per image.

Returning to FIG. 9, after the horizontal vanishing point (HVP) has been located as explained above, the process proceeds to step 9400, in which edges that do not belong to vertical strokes are removed so that the vertical strokes can be detected better.

Edges that do not belong to vertical strokes are removed by comparing the gradient direction with the HVP line direction. The HVP line is the line that passes through the current edge and the HVP. For each edge i at (x_i, y_i), the gradient direction is computed by:

${\tan \; \theta_{i}} = \frac{{Gy}_{i}}{{Gx}_{i}}$

where Gx_i and Gy_i are the gradients along the X direction and the Y direction, respectively.

HVP line direction is computed by:

${\tan \; \beta_{i}} = \frac{{vy} - y_{i}}{{vx} - x_{i}}$

where (vx, vy) are the coordinates of the HVP. If the cross angle of the two lines, Δθ_i=|θ_i−β_i|, is greater than a given threshold Δθ, the edge is removed from the edge image. This is equivalent to comparing |tan Δθ_i| with |tan Δθ|, where |tan Δθ_i| is computed as:

$\left| \tan \Delta\theta_{i} \right| = \left| \tan\left( \theta_{i} - \beta_{i} \right) \right| = \left| \frac{\tan\theta_{i} - \tan\beta_{i}}{1 + \tan\theta_{i}\tan\beta_{i}} \right|$
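Substituting tan θ_i = Gy_i/Gx_i and tan β_i = (vy−y_i)/(vx−x_i) into the formula above and clearing the fractions gives a test that needs no division and no inverse tangent. The sketch below applies that form; the function name `keep_vertical_stroke_edges` and the default threshold of 15 degrees are hypothetical, since the description does not give a value for Δθ, and the threshold is assumed to be below 90 degrees.

```python
import numpy as np

def keep_vertical_stroke_edges(x, y, gx, gy, hvp, max_angle_deg=15.0):
    """Boolean mask of edges whose gradient is within the threshold of the HVP line."""
    x, y, gx, gy = (np.asarray(a, dtype=float) for a in (x, y, gx, gy))
    dx, dy = hvp[0] - x, hvp[1] - y     # direction of the HVP line at each edge
    num = gy * dx - gx * dy             # numerator of tan(theta_i - beta_i)
    den = gx * dx + gy * dy             # denominator of tan(theta_i - beta_i)
    # |num / den| <= tan(threshold), rewritten to avoid dividing by zero.
    return np.abs(num) <= np.tan(np.deg2rad(max_angle_deg)) * np.abs(den)
```

Edges for which the mask is False would be removed from the edge image in step 9400.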

After the edges that do not belong to vertical strokes have been removed in step 9400, the process proceeds to step 9500, in which the line segments associated with vertical strokes are detected.

The vertical stroke candidates are found by finding connective components in the processed edge image. For the purpose of calculating the vanishing point, only dominant connective components, whose length L is within a certain range (12<L<150), are considered.

The line segments associated with vertical strokes are obtained by fitting a line parameterized by an angle θ and a distance ρ from the image origin:

ρ=x cos θ+y sin θ

Each obtained connective component is a list of edge pixels (x_i, y_i) with similar gradient orientation. The line parameters are determined directly from the eigenvalues λ₁ and λ₂ and the eigenvectors v₁ and v₂ of the matrix D associated with the edge pixels.

$D = \begin{bmatrix}{\sum\limits_{i}{\overset{\sim}{x}}_{i}^{2}} & {\sum\limits_{i}{{\overset{\sim}{x}}_{i}{\overset{\sim}{y}}_{i}}} \\{\sum\limits_{i}{{\overset{\sim}{x}}_{i}{\overset{\sim}{y}}_{i}}} & {\sum\limits_{i}{\overset{\sim}{y}}_{i}^{2}}\end{bmatrix}$

where $\tilde{x}_{i} = x_{i} - \bar{x}$ and $\tilde{y}_{i} = y_{i} - \bar{y}$ are the mean-corrected coordinates of the pixels belonging to a particular connective component, and

$\bar{x} = \frac{1}{n}\sum\limits_{i}x_{i} \quad \text{and} \quad \bar{y} = \frac{1}{n}\sum\limits_{i}y_{i}.$

In the case of an ideal line, one of the eigenvalues should be zero.

The quality of the line fit is characterized by the ratio of the two eigenvalues of matrix D,

$v = {\frac{\lambda_{1}}{\lambda_{2}}.}$

The line parameters are determined from the eigenvectors v₁ and v₂, where v₁ is the eigenvector associated with the largest eigenvalue. The parameters of the line ρ=x cos θ+y sin θ are then computed as:

$\theta = {a\; \tan \; \left( \frac{v_{1}(2)}{v_{1}(1)} \right)}$ρ= x cos θ+ y sin θ

where ($\bar{x}$, $\bar{y}$) is the mid-point of the line segment.
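The eigen-decomposition fit can be sketched as follows, assuming NumPy. One interpretation is made here and should be read as an assumption: the eigenvector of the largest eigenvalue of D points along the stroke, so the sketch takes the angle θ of the normal form from the eigenvector of the smallest eigenvalue, which is perpendicular to it, so that every pixel on an ideal line satisfies ρ=x cos θ+y sin θ. The function name `fit_line_segment` is hypothetical.

```python
import numpy as np

def fit_line_segment(xs, ys):
    """Fit rho = x cos(theta) + y sin(theta) to edge pixels via the scatter matrix D."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    xm, ym = xs.mean(), ys.mean()
    xt, yt = xs - xm, ys - ym                 # mean-corrected coordinates
    D = np.array([[np.sum(xt * xt), np.sum(xt * yt)],
                  [np.sum(xt * yt), np.sum(yt * yt)]])
    evals, evecs = np.linalg.eigh(D)          # eigenvalues in ascending order
    normal = evecs[:, 0]                      # smallest eigenvalue: normal to the line
    theta = np.arctan2(normal[1], normal[0])
    rho = xm * np.cos(theta) + ym * np.sin(theta)
    return rho, theta, evals[0], evals[1]     # small and large eigenvalue for the quality ratio
```

For an ideal line the smaller returned eigenvalue is zero up to rounding, in line with the remark above that one of the eigenvalues should be zero.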

After the line segments associated with the vertical strokes have been detected in step 9500, the process proceeds to step 9600, in which the vertical vanishing point (VVP) is located.

The line detection stage in step 9500 gives a set of line segments L={L_l, l=0, . . . , L−1}. The purpose of this step is to locate the optimal convergence point, namely the VVP, of the detected line segments. A statistical approach is used to search for the VVP. The approach consists of minimizing the following functional:

$\min\limits_{x_{0},y_{0}}\sum\limits_{i}W_{i}\left( \sin\beta_{i} \right)^{2}$

$W_{i} = \frac{v_{i}}{V}$

$\sin\beta_{i} = \frac{d_{i}}{r_{i}}$

$d_{i} = \rho_{i} - x_{0}\cos\theta_{i} - y_{0}\sin\theta_{i}$

$r_{i} = \sqrt{\left( x_{0} - \bar{x}_{i} \right)^{2} + \left( y_{0} - \bar{y}_{i} \right)^{2}}$

where v_i is the length of the i-th line segment, V is the total length of all line segments, ($\bar{x}_{i}$, $\bar{y}_{i}$) is the mid-point of the i-th line segment, d_i is the distance of the vanishing point (x₀, y₀) from line segment i, and r_i is the distance between the vanishing point and the center of the line segment. FIG. 18 shows the relationship among the above-mentioned parameters.
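The functional above can be evaluated for a list of candidate points and the minimizer selected, as in the following sketch. It assumes NumPy; the function names `vvp_cost` and `pick_vvp` and the small constant `eps`, which guards against a candidate that coincides with a segment mid-point, are hypothetical.

```python
import numpy as np

def vvp_cost(x0, y0, rho, theta, mid, length, eps=1e-12):
    """Length-weighted sum of sin^2(beta_i) for one candidate vanishing point."""
    d = rho - x0 * np.cos(theta) - y0 * np.sin(theta)   # d_i
    r = np.hypot(x0 - mid[:, 0], y0 - mid[:, 1])        # r_i
    w = length / length.sum()                            # W_i = v_i / V
    return float(np.sum(w * (d / np.maximum(r, eps)) ** 2))

def pick_vvp(candidates, rho, theta, mid, length):
    """Return the index and cost of the candidate that minimizes the functional."""
    costs = [vvp_cost(x, y, rho, theta, mid, length) for x, y in candidates]
    k = int(np.argmin(costs))
    return k, costs[k]
```

Here `rho`, `theta`, `mid` and `length` are arrays holding, for each detected line segment, the fitted line parameters, the mid-point and the segment length.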

Here, the accurate vertical vanishing point (x₀, y₀) is not searched for in the whole image plane but only in the collection of all intersection points of the line segments, which greatly reduces the computation load. However, if the number of line segments is large (>1000), searching all intersection points is still time-consuming. Thus the following method is used to reduce the number of intersection points to be searched.

Step 9601: Project the coordinates ($\bar{x}_{i}$, $\bar{y}_{i}$) of the line segment centers onto the line L which passes through the image center and the horizontal vanishing point (HVP).

Step 9602: Select the leftmost 25% of the line segments into Group 1. Select the rightmost 25% of the line segments into Group 2.

Step 9603: Select the 100 longest line segments from Group 1 and the 100 longest line segments from Group 2.

Step 9604: Search among the intersection points of the line segments selected in step 9603.
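Steps 9601 to 9604 can be sketched as follows, assuming NumPy. Several details are assumptions made for illustration: “leftmost” and “rightmost” are taken to mean the smallest and largest projection coordinates along the line L, intersections are computed between the infinite lines in the normal form (ρ, θ), and the function names `select_segments` and `candidate_intersections` are hypothetical.

```python
import itertools
import numpy as np

def select_segments(mid, length, image_center, hvp, frac=0.25, n_longest=100):
    """Steps 9601-9603: indices of the longest segments at both ends of the line L."""
    mid = np.asarray(mid, dtype=float)
    length = np.asarray(length, dtype=float)
    center = np.asarray(image_center, dtype=float)
    axis = np.asarray(hvp, dtype=float) - center
    axis /= np.linalg.norm(axis)
    t = (mid - center) @ axis                     # step 9601: projection onto L
    order = np.argsort(t)
    k = max(1, int(round(frac * len(t))))
    groups = (order[:k], order[-k:])              # step 9602: the two outer groups
    return [g[np.argsort(-length[g])[:n_longest]] for g in groups]  # step 9603

def candidate_intersections(rho, theta, indices):
    """Step 9604: intersection points of all pairs of the selected lines."""
    points = []
    for i, j in itertools.combinations(indices, 2):
        A = np.array([[np.cos(theta[i]), np.sin(theta[i])],
                      [np.cos(theta[j]), np.sin(theta[j])]])
        if abs(np.linalg.det(A)) > 1e-9:          # skip (nearly) parallel pairs
            points.append(np.linalg.solve(A, [rho[i], rho[j]]))
    return np.array(points)
```

With at most 200 selected segments, the number of candidate intersection points is bounded by 200·199/2 = 19,900, regardless of how many segments were detected.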

After the horizontal vanishing point has been located in step 9300 and the vertical vanishing point has been located in step 9600, the rectification matrix is built in step 9700. A 3×3 rectification matrix can be derived from the HVP and the VVP, according to the well-known geometry of rectification.
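The description does not spell out which construction is used for the rectification matrix. As one standard possibility, given here only as an assumption, the vanishing line through the HVP and the VVP can be sent to infinity by a homography whose last row is that line in homogeneous coordinates. This removes the projective part of the distortion, so that lines converging to the HVP or the VVP become parallel; it does not by itself restore angles or the aspect ratio. The function name `vanishing_line_homography` is hypothetical.

```python
import numpy as np

def vanishing_line_homography(hvp, vvp):
    """3x3 homography that maps the line through HVP and VVP to the line at infinity."""
    h = np.array([hvp[0], hvp[1], 1.0])
    v = np.array([vvp[0], vvp[1], 1.0])
    line = np.cross(h, v)                 # homogeneous vanishing line through both points
    line = line / np.linalg.norm(line)
    H = np.eye(3)
    H[2] = line                           # last row of H is the vanishing line
    # H is invertible only if the vanishing line does not pass through the
    # coordinate origin; this is assumed here.
    return H
```

A point p=(x, y, 1) of the image is rectified as H·p followed by division by the third coordinate; the HVP and the VVP themselves are mapped to points with third coordinate zero, i.e. to infinity.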

FIG. 19 to FIG. 25 give the results of perspective rectification obtained by applying the proposed method according to the second embodiment. FIG. 19 shows a perspective-distorted image. FIG. 20 illustrates the detected horizontal vanishing point HVP, with all the horizontal lines derived from the same point, i.e. the HVP. FIG. 21 shows an image block cropped from the edge image before the edges that do not belong to vertical strokes are removed, and FIG. 22 shows an image block cropped from the edge image after those edges have been removed. FIG. 23 shows the detected line segments (vertical strokes). FIG. 24 illustrates the detected horizontal vanishing point HVP and vertical vanishing point VVP, with all the horizontal lines derived from the HVP and all the vertical lines derived from the VVP. FIG. 25 shows the perspective-rectified image according to the second embodiment of the present invention.

The above-described method for detecting the vanishing points from a document image according to the first and second embodiments of the present invention can be used in a document entry system based on digital cameras, such as the one illustrated in FIG. 26.

FIG. 26 shows a document entry system based on a digital camera to which the method for detecting the vanishing points from a document image according to the first and second embodiments of the present invention can be applied.

As shown in FIG. 26, in step 2601 the document page is shot by a digital camera. Then, in step 2602, the perspective distortion contained in the document page shot by the digital camera is corrected by using the above-described method for detecting the vanishing points from a document image according to the first and second embodiments of the present invention.

Next, in step 2603, the text components are found in the document page whose perspective distortion has been corrected. After optical character recognition is performed in step 2604, the text in the original document page can be output in step 2605.

Besides the above-mentioned concrete embodiments of the method and apparatus of the present invention, the objects of the invention may also be realized by running a program or a set of programs on any information processing equipment as described above, which may communicate with any subsequent processing apparatus. Said information processing equipment and subsequent processing apparatus may all be well-known universal equipment.

Therefore, it is important to note that the present invention includes a case wherein the invention is achieved by directly or remotely supplying a program (a program corresponding to the illustrated flow chart in the embodiment) of software that implements the functions of the aforementioned embodiments to a system or apparatus, and reading out and executing the supplied program code by a computer of that system or apparatus. In such a case, the form is not limited to a program as long as the program function can be provided.

Therefore, the program code itself, installed in a computer to implement the functional process of the present invention using the computer, implements the present invention. That is, the present invention includes the computer program itself for implementing the functional process of the present invention.

In this case, the form of the program is not particularly limited, and an object code, a program to be executed by an interpreter, script data to be supplied to an OS, and the like may be used as long as they have the program function.

As a recording medium for supplying the program, for example, a floppy disk, hard disk, optical disk, magneto optical disk, MO, CD-ROM, CD-R, CD-RW, magnetic tape, nonvolatile memory card, ROM, DVD (DVD-ROM, DVD-R), and the like may be used.

As another program supply method, a connection may be established to a given home page on the Internet using a browser on a client computer, and the computer program itself of the present invention, or a file which is compressed and includes an automatic installation function, may be downloaded from that home page to a recording medium such as a hard disk or the like, thus supplying the program. Also, the program codes that form the program of the present invention may be broken up into a plurality of files, and these files may be downloaded from different home pages. That is, the present invention also includes a WWW server that makes a plurality of users download program files for implementing the functional process of the present invention using a computer.

Also, a storage medium such as a CD-ROM or the like, which stores the encrypted program of the present invention, may be delivered to the user, the user who has cleared a predetermined condition may be allowed to download key information that decrypts the program from a home page via the Internet, and the encrypted program may be executed using that key information to be installed on a computer, thus implementing the present invention.

The functions of the aforementioned embodiments may be implemented not only by executing the readout program code by the computer but also by some or all of the actual processing operations executed by an OS or the like running on the computer on the basis of an instruction of that program.

Furthermore, the functions of the aforementioned embodiments may be implemented by some or all of the actual processes executed by a CPU or the like arranged in a function extension board or a function extension unit, which is inserted in or connected to the computer, after the program read out from the recording medium is written in a memory of the extension board or unit.

What has been described herein is merely illustrative of the application of the principles of the present invention. For example, the functions described above as implemented as the best mode for operating the present invention are for illustration purposes only. As a particular example, other designs may be used for obtaining and analyzing waveform data to determine speech. Also, the present invention may be used for other purposes besides detecting speech. Accordingly, other arrangements and methods may be implemented by those skilled in the art without departing from the scope and spirit of this invention.

1. A method for detecting at least one vanishing point from an image, the method comprising: a dividing step for dividing the image into a plurality of patches; a first detecting step for detecting each patch's local orientations; a composing step for composing lines of pencils based on the local orientations detected in said first detecting step; and a first computing step for computing at least one vanishing point based on the lines of pencils composed in said composing step.
2. A method according to claim 1, further comprising a first pruning step for pruning a noisy line from the lines of pencils composed in said composing step, and wherein, in said first computing step, the vanishing point is computed based on the lines of pencils after the noisy line was pruned in said first pruning step.
3. A method according to claim 1, wherein the local orientations are detected based on local spectra analysis of the patches in said first detecting step.
4. A method according to claim 3, wherein said first detecting step comprises: a preprocessing step for preprocessing the patches with a spectra filter; a second computing step for computing the patch's spectra by processing the preprocessed patches with FFT; a second pruning step for adaptively pruning the patch's spectra obtained in said second computing step; and an estimating step for estimating the local orientations by independent component analysis of the spectra.
5. A method according to claim 4, wherein the spectra filter used in said preprocessing step is a Hanning filter.
6. A method according to claim 4, wherein, in said second pruning step, the spectra are pruned by deleting the directed current component and reserving the first “n” largest spectra component, wherein the “n” equals a template size.
7. A method according to claim 1, further comprising: a second detecting step for detecting an edge of the image and forming an edge image; an extracting step for extracting the text baseline from said edge image; and a finding step for finding a horizontal vanishing point based on the text base line.
8. A method according to claim 7, further comprising: a removing step for removing edges, that do not belong to vertical strokes, from the edge image; a line segment detecting step for detecting line segments of vertical strokes from the edge image removed in said removing step; and a vertical vanishing point calculating step for calculating the vertical vanishing point based on the line segments detected in said line segment detecting step.
9. An apparatus for detecting at least one vanishing point from an image, the apparatus comprising: a dividing unit for dividing the image into a plurality of patches; a first detecting unit for detecting each patch's local orientations; a composing unit for composing lines of pencils based on the local orientations detected by said first detecting unit; and a first computing unit for computing at least one vanishing point based on the lines of pencils composed by said composing unit.
10. A computer-readable storage medium storing a computer program, wherein the computer program enables a computer to execute: a dividing step for dividing the image into a plurality of patches; a first detecting step for detecting each patch's local orientations; a composing step for composing lines of pencils based on the local orientations detected in said first detecting step; and a first computing step for computing at least one vanishing point based on the lines of pencils composed in said composing step.
11. A method for detecting at least one vanishing point from an image, the method comprising: a second detecting step for detecting an edge of the image and forming an edge image; an extracting step for extracting the text baseline from said edge image; and a finding step for finding a horizontal vanishing point based on the text base line.
12. A method according to claim 11, further comprising: a removing step for removing edges, that do not belong to vertical strokes, from the edge image; a line segment detecting step for detecting line segments of vertical strokes from the edge image removed in said removing step; and a vertical vanishing point calculating step for calculating the vertical vanishing point based on the line segments detected in said line segment detecting step.
13. An apparatus for detecting at least one vanishing point from an image, the apparatus comprising: a second detecting unit for detecting an edge of the image and forming an edge image; an extracting unit for extracting the text baseline from said edge image; and a finding unit for finding a horizontal vanishing point based on the text base line.
14. A method according to claim 13, further comprising: a removing unit for removing edges, that do not belong to vertical strokes, from the edge image; a line segment detecting unit for detecting line segments of vertical strokes from the edge image removed by said removing unit; and a vertical vanishing point calculating unit for calculating the vertical vanishing point based on the line segments detected by said line segment detecting unit.